Abstract. We have developed a specialized computational instrument
for fitting models of pulsating white dwarfs to observations made
with the Whole Earth Telescope. This metacomputer makes use of
inexpensive PC hardware and free software, including a parallel
genetic algorithm which performs a global search for the best-fit
set of parameters.

1. INTRODUCTION

White dwarf asteroseismology offers the opportunity to probe 
the structure and composition of stellar objects governed by rela- 
tively simple principles. The observational requirements of aster- 
oseismology have been addressed by the development of the Whole 
Earth Telescope (WET), but the analytical procedures need to be re- 
fined to take full advantage of the possibilities afforded by the WET 
data. 

The adjustable parameters in our computer models of white 
dwarfs presently include the total mass, the temperature, the H and 
He layer masses, the core composition, and the transition zone thick- 
nesses. Finding a proper set of these to provide a close fit to the 
observed data is difficult. The current procedure is a cut-and-try 
process guided by intuition and experience, and is far more sub- 
jective than we would like. Objective procedures for determining 
the best-fit model are essential if asteroseismology is to become a 
widely accepted and reliable astronomical technique. We must be
able to demonstrate that, within the range of different values the 
model parameters can assume, we have found the only solution, or 
the best one if more than one is possible. To address this problem,
we are applying a search-and-fit technique employing a genetic algo- 
rithm (GA), which can explore the myriad parameter combinations 
possible and select for us the best one, or ones (cf. Goldberg 1989, 
Charbonneau 1995, Metcalfe & Nather 1999). 
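
The search strategy described above can be illustrated with a minimal
generational GA. This is only a sketch under stated assumptions: it is
not PIKAIA, the function names (toy_fitness, run_ga) are hypothetical,
and the toy fitness function merely stands in for the comparison of
calculated and observed pulsation periods. Parameters are scaled to the
interval [0, 1].

```python
import random

def toy_fitness(params):
    """Hypothetical stand-in for the match between model and observations.

    Peaks at 1.0 when every parameter equals 0.3; a real fitness would
    compare calculated and observed pulsation periods.
    """
    return 1.0 / (1.0 + sum((p - 0.3) ** 2 for p in params))

def run_ga(fitness, n_params=3, pop_size=40, generations=60, seed=0,
           p_mutate=0.1):
    """Search [0, 1]^n_params with a simple GA (requires n_params >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        new_pop = [scored[0][:]]              # elitism: keep the current best
        while len(new_pop) < pop_size:
            # Tournament selection: the fitter of two random individuals breeds.
            mom = max(rng.sample(scored, 2), key=fitness)
            dad = max(rng.sample(scored, 2), key=fitness)
            cut = rng.randrange(1, n_params)  # one-point crossover
            child = mom[:cut] + dad[cut:]
            # Mutation: occasionally replace a gene with a fresh random value.
            child = [rng.random() if rng.random() < p_mutate else g
                     for g in child]
            new_pop.append(child)
        pop = new_pop
    best = max(pop, key=fitness)
    return best, fitness(best)

best_params, best_score = run_ga(toy_fitness)
```

Because the best individual is always carried into the next generation,
the best fitness in the population never decreases from one generation
to the next.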

Although genetic algorithms are more efficient than other com- 
parably global techniques, they are still quite demanding computa- 
tionally. To be practical, the GA-based fitting technique requires 
a dedicated instrument to perform the calculations. Over the past 
year, we have designed and configured such an instrument — an iso- 
lated network of 64 minimal PCs running Linux. Since the structure 
of a GA is very conducive to parallelization, this metacomputer al- 
lows us to run our code much faster than would otherwise be possible. 

2. HARDWARE 

In January 1998, around the time that the idea of commodity 
parallel processing started getting a lot of attention, we were inde- 
pendently designing a metacomputer of our own. Our budget was 
modest, so we set out to get the best performance possible per dollar 
without restricting the ability of the machine to solve our specific 
problem. 

The original Beowulf cluster (Becker et al. 1995), which we didn't 
know about at the time, had a number of features which, though 
they contributed to the utility of the machine as a multi-purpose 
computational tool, were unnecessary for our particular problem. 
We wanted to use each node of the metacomputer to run identical 
tasks with small, independent sets of data. The results of the cal- 
culations performed by the nodes consisted of just a few numbers 
which only needed to be communicated to the master process, never 
to another node. Essentially, network bandwidth was not an issue 
because the computation to communication ratio of our application 
was extremely high, and hard disks were not needed on the nodes 
because our problem did not require any significant amount of data 
swapping. In the end we settled on a design including one master 
server augmented by minimal nodes connected by a simple 10base-2 
network (see Figure 1). 
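
The communication pattern described above (identical tasks, small
independent inputs, one number returned to the master) can be sketched
as follows. This is an illustration only: a local thread pool stands in
for the networked nodes, and evaluate_model, its placeholder arithmetic,
and the reference values 0.6 and 25000 are hypothetical rather than
taken from the actual white dwarf code.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_model(params):
    """Hypothetical node task: a small input, a computation, one number out."""
    mass, temperature = params
    # Placeholder arithmetic standing in for an evolution + pulsation run.
    return (mass - 0.6) ** 2 + ((temperature - 25000.0) / 1000.0) ** 2

def evaluate_population(population, n_workers=4):
    """Master: farm out independent trials and collect one result per trial."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves input order, so results[i] belongs to population[i].
        return list(pool.map(evaluate_model, population))

trials = [(0.6, 25000.0), (0.7, 26000.0)]
results = evaluate_population(trials)
```

Workers never exchange data with each other, so the only traffic is a
short parameter list going out and a single number coming back.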

The master computer is a Pentium-II 333 MHz system with three
NE-2000 compatible network cards, each of which drives one third of
the nodes on its own subnet. Since a single 10base-2 segment can
handle up to 30 devices, no repeater was necessary.
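
The division of nodes among the three network cards can be checked with
simple arithmetic. The sketch below assumes the 64 nodes are split as
evenly as possible, which the text does not state explicitly, and
counts the master's own card as one device on each subnet; the helper
name split_nodes is hypothetical.

```python
def split_nodes(n_nodes, n_segments):
    """Divide n_nodes as evenly as possible among n_segments."""
    base, extra = divmod(n_nodes, n_segments)
    return [base + (1 if i < extra else 0) for i in range(n_segments)]

MAX_DEVICES_PER_SEGMENT = 30   # limit quoted in the text

nodes_per_segment = split_nodes(64, 3)
# Each segment also carries one of the master's three network cards.
devices_per_segment = [n + 1 for n in nodes_per_segment]
needs_repeater = any(d > MAX_DEVICES_PER_SEGMENT for d in devices_per_segment)
print(nodes_per_segment, devices_per_segment, needs_repeater)
# prints [22, 21, 21] [23, 22, 22] False
```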

Fig. 1. The 64 minimal nodes of the metacomputer on shelves
surrounding the master computer.



The slave nodes were assembled from components obtained at a
discount computer outlet. Each node consists of an ATX tower case
with a motherboard, processor and fan, a single 32 MB SDRAM module,
and an NE-2000 compatible network card with a custom-made boot
EPROM. The nodes are connected in series with 3-ft ethernet coaxial
cables. Half of the nodes contain Pentium-II 300 MHz processors,
while the other half contain AMD K6-II 450 MHz processors. The total
cost of the system was around $25k, but it could be built for
considerably less today, and less still tomorrow.

3. SOFTWARE 

To make the metacomputer work, we relied on the open-source 
Linux operating system and software tools. We programmed the 
EPROMs with Gero Kuhlmann's NETBOOT package to allow each 
node to download and mount an independent Linux filesystem on 
a small ramdisk partition. We used Tom Fawcett's YARD package 
to create the filesystem, and we included in it a pared down version 
of the PVM software developed at Oak Ridge National Laboratory 
(Geist et al. 1994). 

Finally, we incorporated the message passing routines of the
PVM library into PIKAIA, a general purpose public-domain GA
developed by Charbonneau (1995), and we modified the white dwarf
evolution and pulsation codes (see Wood 1990, Bradley 1993,
Montgomery 1998) to allow reliable and automated calculation of the
normal modes of oscillation for white dwarf stars with a wide range
of masses, temperatures, and other parameters.

4. BENCHMARKS 

Measuring the absolute performance of the metacomputer is dif- 
ficult because the result strongly depends on the fraction of Floating- 
point Division operations (FDIVs) used in the benchmark code. Ta- 
ble 1 lists four different measures of the absolute speed in Millions 
of FLoating-point Operations Per Second (MFLOPS). 



Table 1. The absolute speed of the metacomputer.

Benchmark    P-II 300 MHz   K6-II 450 MHz   Total Speed
MFLOPS(1)        80.6            65.1          4662.4
MFLOPS(2)        47.9            67.7          3699.2
MFLOPS(3)        56.8           106.9          7056.0
MFLOPS(4)        65.5           158.9          7180.8


The code for MFLOPS(1) is essentially scalar; that is, on a vector
processor it runs at scalar speed, which lies far below the expected
vector performance. Also, the percentage of FDIVs (9.6%) is
considered somewhat high. The code for MFLOPS(2) is fully
vectorizable. The percentage of FDIVs (9.2%) is still somewhat on
the high side. The code for MFLOPS(3) is also fully vectorizable.
The percentage of FDIVs (3.4%) is considered moderate. The code for
MFLOPS(4) is fully vectorizable, but the percentage of FDIVs is
zero.

We feel that MFLOPS(3) provides the best measure of the expected
performance for the white dwarf code, because of the moderate
percentage of FDIVs. Adopting this value, we have achieved a
price to performance ratio near $3.50/MFLOPS.
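
The quoted ratio follows from dividing the approximate system cost by
the total MFLOPS(3) speed in Table 1. The short check below takes the
cost as exactly $25,000, although the text gives it only as "around
$25k", so the result is approximate; the function name is hypothetical.

```python
def price_per_mflops(cost_dollars, total_mflops):
    """Price to performance ratio in dollars per MFLOPS."""
    return cost_dollars / total_mflops

ratio = price_per_mflops(25_000, 7056.0)   # MFLOPS(3) total from Table 1
print(f"${ratio:.2f}/MFLOPS")              # prints $3.54/MFLOPS
```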